Analyzing IO Amplification in Linux File Systems

Authors

  • Jayashree Mohan
  • Rohan Kadekodi
  • Vijay Chidambaram
Abstract

We present the first systematic analysis of read, write, and space amplification in Linux file systems. While many researchers are tackling write amplification in key-value stores, IO amplification in file systems has been largely unexplored. We analyze data and metadata operations on five widely used Linux file systems: ext2, ext4, XFS, btrfs, and F2FS. We find that data operations result in significant write amplification (2–32×) and that metadata operations have a large IO cost. For example, a single rename requires 648 KB of write IO in btrfs. We also find that small random reads result in read amplification of 2–13×. Based on these observations, we present the CReWS conjecture about the relationship between IO amplification, consistency, and storage space utilization. We hope this paper spurs people to design future file systems with less IO amplification, especially for non-volatile memory technologies.
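
As a rough illustration of the metric reported above, amplification is conventionally taken to be the ratio of the IO that reaches the storage device to the IO the user requested. The sketch below assumes that definition. The function name and the 4 KB / 128 KB figures are hypothetical, chosen only to land on the 32× upper bound quoted in the abstract; they are not measurements from the paper.

```python
def amplification(device_bytes: int, user_bytes: int) -> float:
    """Ratio of bytes moved at the device to bytes requested by the user.

    Assumed definition for illustration; not taken from the paper's text.
    """
    if user_bytes <= 0:
        raise ValueError("user_bytes must be positive")
    return device_bytes / user_bytes


KIB = 1024

# Hypothetical example: a 4 KB user write that causes 128 KB of device writes.
ratio = amplification(device_bytes=128 * KIB, user_bytes=4 * KIB)
print(f"write amplification: {ratio:.0f}x")  # prints "write amplification: 32x"
```

The same ratio applies to reads by substituting bytes read from the device and bytes returned to the user. Operations that carry no user data, such as a rename, are better described by their absolute IO cost, as the abstract does.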

Similar resources

Priority IO Scheduling in the Cloud

Current state of the art runtime systems, built for managing cloud environments, almost always assume resource sharing among multiple users and applications. In large part, these runtime systems rely on functionalities of the node-local operating systems to divide the local resources among the applications that share a node. While OSes usually achieve good resource sharing by creating distinct ...

An IO Scheduling Algorithm to Improve Performance of Flash-Based Solid State Disks

Since the emergence of solid state devices into the storage scene, improvements in capacity and price have brought them to the point where they are becoming a viable alternative to traditional magnetic storage media. Current file systems and device-level I/O schedulers are optimized for rotational magnetic hard disk drives. In order to improve the efficiency of hard disk utilization, an Operati...

Scalability of Transient CFD on Large-Scale Linux Clusters with Parallel File Systems

This work examines the parallel scalability characteristics of commercial CFD software FLUENT and STAR-CD for up to 256 processing cores, and research CFD software CDP from Stanford University for up to 512 cores – for transient CFD simulations that are heavy in IO relative to numerical operations. In three independent studies conducted with engineering contributions from the University of Cambridg...

Linux 2.6 IO Performance Analysis, Quantification, and Optimization

This paper presents a novel taxonomy that characterizes in a structured and pragmatic manner the interrelationships and tradeoffs of the (rather complex) Linux 2.6 IO stack. The focus is on elaborating on the tools and techniques available in Linux 2.6 to analyze, quantify, and optimize workload-dependent IO performance. The argument made is that only a detailed, layered analysis of the Linux 2...

PVFS: A Parallel File System for Linux Clusters

As Linux clusters have matured as platforms for low-cost, high-performance parallel computing, software packages to provide many key services have emerged, especially in areas such as message passing and networking. One area devoid of support, however, has been parallel file systems, which are critical for high-performance I/O on such clusters. We have developed a parallel file system for Linux c...

Journal:
  • CoRR

Volume abs/1707.08514

Publication date: 2017